Skip to content

fix: lazy cross module base class imports for python speedups#5150

Open
dgandhi62 wants to merge 48 commits into
mainfrom
python-imports-6
Open

fix: lazy cross module base class imports for python speedups#5150
dgandhi62 wants to merge 48 commits into
mainfrom
python-imports-6

Conversation

@dgandhi62

@dgandhi62 dgandhi62 commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

This pr defers all Python classes & interfaces into factory functions and replaces cross-module imports with lazy proxy objects. This enables us to actually make submodule contents lazy and reduces the time taken to import python submodules by ~55%

The problem can be defined as follows - Before, if a user wrote import aws_cdk.aws_s3, Python had to

  • Execute its cross-module imports at the top of the init file (for example, aws_iam)
  • Build its classes, interfaces, and objects
  • Register all types with the jsii runtime

This lead to a cascading init which would scale as the size of the library grows. To solve this, we introduce lazy factories.

Part 1 - making base classes lazy

All classses, interfaces, and structs are wrapped in @_memoized factory functions, for example:

# Before (eager)
class Bucket(metaclass=jsii.JSIIMeta, jsii_type="aws-cdk-lib.aws_s3.Bucket"):
    ...

# After (lazy)
@_memoized
def _lazy_build_Bucket() -> type:
    class Bucket(metaclass=jsii.JSIIMeta, jsii_type="aws-cdk-lib.aws_s3.Bucket"):
        ...
    return Bucket

getattr on the module triggers the factory on first access, which then subsequently caches the module.

Furthermore, typecheckers like mypy and pyright see the full class shapes via stubs in an if typing.TYPE_CHECKING: block. These stubs have empty bodies and are never executed at runtime. The reason for this is that type checkers evaluate code statically and need to see class definitions with their full signatures. Since the factory functions hide the class inside a closure, type checkers don't like this and therefore we need stubs.

Part 2 - making annotation-only imports lazy

Cross-module imports exist mainly to support type annotations. Since all type annotations are rendered as string literals, they never actually trigger module loading at definition. Therefore, these imports are annotation-only and are only needed at runtime when code actively uses the imported type

We exploit this by placing the real import under if typing.TYPE_CHECKING: (for static analysis) and using a _LazyImport proxy in the else branch that defers importlib.import_module() until first attribute access:

# Before (eager — triggers cascading module loads at import time)
import scope.jsii_calc_base_of_base as _mod

# After (lazy — module only loads when _mod.SomeType is actually accessed)
if typing.TYPE_CHECKING:
    import scope.jsii_calc_base_of_base as _mod
else:
    _mod = _LazyImport("scope.jsii_calc_base_of_base")

Note

With lazy class factories, classes don't register themselves with the jsii runtime until their factory runs. This is a problem when the kernel returns object references with type FQNs which aren't registered. As an example, if the user never directly accessed aws_cdk.aws_s3.Bucket in their Python code (so the factory never ran), the runtime does not know which class to deserialize that reference into.

The _reference_map.py in the Python runtime now handles lazy types. For instance, when the kernel returns a type reference that isn't yet registered, it triggers getattr on the appropriate module to materialize the factory, which registers the type as a side effect of class construction via JSIIMeta.

EDGE CASES

This section is basically weird things I encountered which my solution also accounts for in the code.

Same-module base class references

When class B(A) and A are both in the same module, A doesn't exist yet when B's factory runs (both are deferred). The generated code replaces bare base class names with factory calls:

class B(_lazy_build_A(), metaclass=jsii.JSIIMeta, ...):
    ...

typing.get_type_hints() needs all deferred classes resolved

jsii-pacmak generates runtime type-checking code that validates method arguments at call time. This includes the typechecking call and the typechecking stub which you can see at the end of a module. typing.get_type_hints() looks at the stubs annotations dict and resolves each string in a namespace. Previously, stubs had unquoted annotations, and typing.get_type_hints() was called with no explicit namespace (which I think it defaulted to the stub's globals). This was fine becuse all classes existed in globals() at module scope. Now classes are deferred. They don't exist in globals() until their factory runs.

To solve this, we introduce _get_typechecking_ns(), which builds a dict that's a copy of globals() plus every deferred class (by calling all factories). We pass this dict as globalns so get_type_hints() can find any name.

Before

if __debug__:
    type_hints = typing.get_type_hints(_typecheckingstub__abc123)
    check_type(...)

After

if __debug__:
    type_hints = typing.get_type_hints(_typecheckingstub__abc123, globalns=_get_typechecking_ns(), localns=_get_typechecking_localns())
    check_type(...)

qualname issues

Python automatically sets qualname based on where a class is defined lexically. A class inside a function gets function_name..ClassName. Since we define classes inside lazy_build* functions, they'd show up as _lazy_build_Bucket..Bucket in error messages (an example), tracebacks, and repr() instead of just Bucket.

The fix is a single line after the class is created inside the factory:
Bucket.__qualname__ = "Bucket"

Intersection type helper classes

When a parameter is typed as A & B in TypeScript, pacmak generates a synthetic Protocol class that extends both interfaces (for instance, think class _ISomething_IFriendly(ISomething, IFriendly, Protocol). These helper classes were previously emitted at module scope where their base classes already existed in globals(). With lazy factories, the base classes are now factories, which causes a NameError.

The fix basically moves the helper classes into the else block of if typing.TYPE_CHECKING and replaces same-module base class names with their factory call.

# Before (broken — ISomething not in globals at import time)
class _ISomething_IFriendly(ISomething, _other_pkg.IFriendly, typing_extensions.Protocol):
    pass

# After (works — factory call materializes ISomething)
class _ISomething_IFriendly(_lazy_build_ISomething(), _other_pkg.IFriendly, typing_extensions.Protocol):
    pass

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@mergify mergify Bot added the contribution/core This is a PR that came from AWS. label Jun 4, 2026
@dgandhi62 dgandhi62 changed the base branch from main to python-imports-4 June 5, 2026 00:40
@dgandhi62 dgandhi62 changed the base branch from python-imports-4 to main June 5, 2026 01:00
@dgandhi62 dgandhi62 force-pushed the python-imports-6 branch 3 times, most recently from 800aedb to 11e28fb Compare June 8, 2026 20:05
@rix0rrr

rix0rrr commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Nice! Couple of questions:

Backwards compat

Are there any consequences for backwards compatibility? We are changing both the generated code and the runtime here. We can assume new gencode = new runtime, but we cannot assume the reverse: the new runtime code be interacting with old codegen'ned code. Spare a moment to think about what the consequences of that are. Anything that might break?

Qualname

Don't forget that we have classes inside classes as well, for example in the AWS CDK L1s.

So for:

def lazy_build_CfnBucket:
  class CfnBucket:
    class LifecycleRuleProperty:
      pass

   pass

The __qualname__ of that inner type should still be CfnBucket.LifecycleRuleProperty, not LifecycleRuleProperty. Do we have that correct?

get_type_hints

You wrote:

To solve this, we introduce _get_typechecking_ns(), which builds a dict that's a copy of globals() plus every deferred class (by calling all factories). We pass this dict as globalns so get_type_hints() can find any name.

Does that mean the first time we call a function and we do run-time type checking, we resolve all lazy class factories after all?

Comment on lines +97 to +106
if python_module is None:
try:
importlib.import_module(root_module)
python_module = root_module
type_parts = parts[1:]
except (ImportError, ModuleNotFoundError):
return False

if python_module is None:
return False

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this behavior fall out of the previous for loop automatically?

Seems to me there are 2 steps to a type lookup:

  • Find the most specific "module-like" object to contain the type (can be the root module if nothing more specific)
  • Then traverse that module-like object using .-separated lookups of the remaining type parts.

At the end, we should actually have that object in a variable directly. We don't need to rely on the @jsii decorator having stuffed the type into a global map and looking for it there. The traversal should will have given us the type directly.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function might end up a lot nicer (and easier to understand) if you split those phases up into separate functions.

@dgandhi62 dgandhi62 Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That... makes sense. Fixed. I split it up into three phases

  • find the Python module, import it, return (module_name, type_parts) or None
  • traverse the module via getattr() to trigger lazy factories, return bool
  • call Phase 1, check registry, call Phase 2 if needed

Comment thread packages/@jsii/python-runtime/src/jsii/_reference_map.py Outdated
Comment on lines +1623 to +1624
for name, factory in _LAZY_CLASSES.items():
ns[name] = factory()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is de-lazyifying everything, isn't it?

Perhaps we're better off using another object like _LazyImport where the __getattr__() will trigger the factory, and using that a namespace. And we can also use that for the base classes etc.

THIS_MODULE = _LazyAccess()

def _build_Consumer(): 
  class Consumer(THIS_MODULE.BaseClass):
    ...

THIS_MODULE.register('Consumer', _build_Consumer)


def __getattr__(key):
  return THIS_MODULE[key]

Or even:

THIS_MODULE = _LazyAccess()

@THIS_MODULE.register_type('Consumer')
def _build_Consumer(): 
  class Consumer(THIS_MODULE.BaseClass):
    ...

@dgandhi62 dgandhi62 Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it de-lazifies the module, but only on the first type check in debug mode (python -O skips it entirely). Also it's per-module, not global.

I kept running into some Python test errors earlier. Let me double check and tell you more accurately

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constraint (I think) is, that on Python 3.12-3.13, ForwardRef._evaluate calls eval(code, globalns, localns). We need to pass a separate localns (plain dict, different object than globalns) to prevent the homonymous caching issue mentioned below. But when eval() receives a plain dict as localns, it checks localns first for the name. If it's not there, it raises NameError without ever falling through to check globalns.missing. So even if globalns were a lazy dict, the missing never fires.

On Python 3.14, the ForwardRef caching and Union interning that cause the homonymous issue are both gone, so the separate localns isn't even needed there. And 3.14's ForwardRef.evaluate() would actually work with a lazy dict. But we need to be compatible with everything.

Comment on lines +1630 to +1633
if _typechecking_localns is None:
ns = dict(globals())
for name, factory in _LAZY_CLASSES.items():
ns[name] = factory()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function looks the same as the other one?

@dgandhi62 dgandhi62 Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was kinda confused with this as well. But that is intentional, one for globalns and one for localns. It prevents Python's ForwardRef from reusing a cached resolution from a sibling module that has a type with the same name. Specifically we have a test (test_homonymous_forward_references) which fails for this

@dgandhi62 dgandhi62 Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is your friendly neighbourhood kiro's generated example for this.

Module foo/                          Module bar/
─────────────────────────           ─────────────────────────
class Homonymous:                   class Homonymous:
    string_property: str                numeric_property: int

def stub(x: "Homonymous"): ...     def stub(x: "Homonymous"): ...
  1. foo calls typing.get_type_hints(foo.stub, globalns=foo_ns) — no localns passed
  2. Python sets localns = globalns (same object)
  3. ForwardRef resolves "Homonymous"foo.Homonymous, caches it, sets __forward_evaluated__ = True
  4. bar calls typing.get_type_hints(bar.stub, globalns=bar_ns) — no localns passed
  5. Python sets localns = globalns (same object)
  6. ForwardRef checks: not __forward_evaluated__ → False. localns is not globalns → False (same object).
  7. Returns cached foo.Homonymous → wrong type → TypeError

It seems that Python reuses typing.Union objects in memory, so two stubs in different modules with the identical annotation typing.Union["Homonymous", Dict[str, Any]] share the same ForwardRef object in memory

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically now, localns is not globalns is always True → ForwardRef always re-evaluates → each module gets its own correct Homonymous.

@dgandhi62

Copy link
Copy Markdown
Contributor Author

To your first comment,

I fixed the qualname point you mentioned and addressed 3 in a different comment. For backward compatibility, there should not be issue (?). The new code adds exports but the old code never references them. The changes in the runtime add code paths when eager registering doesn't happen, so old code should still work. Still a work in progress, its a big change so I'm trying to think through all the ramifications.

@dgandhi62 dgandhi62 force-pushed the python-imports-6 branch 2 times, most recently from e31270b to 43addb2 Compare June 10, 2026 15:21
@dgandhi62

dgandhi62 commented Jun 10, 2026

Copy link
Copy Markdown
Contributor Author

For the question of backward compat, I think this saves us @rix0rrr -

`jsii${toPythonVersionRange(`^${VERSION}`)}`,

There is a minimum runtime jsii dependency declared. Which I would understand to mean that newly generated code will require the new runtime AND only runtime wont work.
The reverse (new runtime with old code) should not be an issue because the changes in runtime are just additions. With the old code, the new paths should not be triggered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contribution/core This is a PR that came from AWS.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants